3574 results found.
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
4 GByte
Production Status:
Existing-used
Use:
- Paper title: A Visually-Grounded Parallel Corpus with Phrase-to-Region Linking
- Paper track: Multimodality/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hideki Nakayama | Flickr30k Entities | /N |
Documentation:
None
Multimodal/Multimedia
Corpus,
Language Type:
Bilingual
Languages:
English Japanese
Availability:
Freely Available
License:
CreativeCommons Attribution-ShareAlike
Size:
4 GByte
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: A Visually-Grounded Parallel Corpus with Phrase-to-Region Linking
- Paper track: Multimodality/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hideki Nakayama | Flickr30k Entities JP | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution Share-Alike 3.0
Size:
3.3 MByte
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
- Paper title: The STEM-ECR Dataset: Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Jennifer D'Souza | STEM-ECR | /N |
Documentation:
Yes, there are annotation guidelines. In English. Yes, it is.
Written
Lexicon,
Language Type:
Monolingual
Languages:
English
Availability:
Freely available for academic purposes
License:
ELRA
Size:
6470 entries
Production Status:
Newly created-finished
Use:
Opinion Mining/Sentiment Analysis
- Paper title: Design and Evaluation of SentiEcon: a fine-grained Economic/Financial Sentiment Lexicon from a Corpus of Business News
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Antonio Moreno-Ortiz | SentiEcon | /N |
Documentation:
Submitted publication
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
8.1M words
Production Status:
Existing-used
Use:
Language Modelling
- Paper title: On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yerai Doval | UMBC | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English Farsi Finnish German Italian Russian Spanish
Availability:
Freely Available
License:
Size:
35.3M words
Production Status:
Existing-used
Use:
Language Modelling
- Paper title: On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yerai Doval | Wikipedia Polyglot | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English Farsi Finnish German Italian Russian Spanish
Availability:
Freely Available
License:
Size:
15.4M words
Production Status:
Existing-used
Use:
Language Modelling
- Paper title: On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yerai Doval | Twitter (2015-2018) | /N |
Documentation:
None
Written
Word embedding model,
Language Type:
Multilingual
Languages:
English Farsi Finnish German Italian Japanese Spanish
Availability:
License:
Size:
None
Production Status:
Existing-used
Use:
Language Modelling
- Paper title: On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yerai Doval | Twitter embeddings (Camacho-Collados) | /N |
Documentation:
None
Written
Lexicon,
Language Type:
Multilingual
Languages:
English Farsi Finnish German Italian Russian Spanish
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Language Modelling
- Paper title: On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yerai Doval | Bilingual dictionaries (MUSE) | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Not yet available. Will be published in LDC's catalog after evaluations using data are complete.
License:
LDC
Size:
36 GByte
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: The SAFE-T Corpus: A New Resource for Simulated Public Safety Communications
- Paper track: Speech/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Dana Delgado | SAFE-T Corpus | /N |
Documentation:
Yes. English. The documentation will be included when the corpus is released in LDC's catalog.




